Abstract

We study the problem of answering a workload of linear queries $\mathcal{Q}$, on a database of size at most
$n = o(|\mathcal{Q}|)$ drawn from a universe $\mathcal{U}$ under the constraint of (approximate) differential privacy. Nikolov,
Talwar, and Zhang [NTZ13] proposed an efficient mechanism that, for any given $\mathcal{Q}$ and $n$, answers the
queries with average error that is at most a factor polynomial in $\log |\mathcal{Q}|$ and $\log |\mathcal{U}|$ worse than the best
possible. Here we improve on this guarantee and give a mechanism whose competitiveness ratio is at
most polynomial in $\log n$ and $\log |\mathcal{U}|$, and has no dependence on $|\mathcal{Q}|$. Our mechanism is based on the
projection mechanism of [NTZ13], but in place of an ad-hoc noise distribution, we use a distribution
which is in a sense optimal for the projection mechanism, and analyze it using convex duality and the
restricted invertibility principle.


1 Introduction 

The central problem of private data analysis is to characterize to what extent it is possible to compute useful 
information from statistical data without compromising the privacy of the individuals represented in the 
dataset. In order to formulate this problem precisely, we need a database model and a definition of what it 
means to preserve privacy. Following prior work, we model a database as a multiset D of n elements from a 
universe $\mathcal{U}$, with each database element specifying the data of a single individual. Defining privacy is more
subtle. A definition which has received considerable attention in recent years is differential privacy, which
postulates that a randomized algorithm preserves privacy if its distribution on outputs is almost the same 
(in an appropriate metric) on any two input databases D and D' that differ in the data of at most a single 
individual. The formal definition is as follows: 

Definition 1.1 ([DMNS06]). Two databases $D$ and $D'$ are neighboring if the size of their symmetric difference
is at most one. A randomized algorithm $\mathcal{M}$ satisfies $(\varepsilon, \delta)$-differential privacy if for any two neighboring
databases $D$ and $D'$ and any measurable event $S$ in the range of $\mathcal{M}$,

$$\Pr[\mathcal{M}(D) \in S] \le e^{\varepsilon} \Pr[\mathcal{M}(D') \in S] + \delta.$$
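
For mechanisms with finitely many outputs, the inequality in Definition 1.1 can be checked directly. The sketch below is only an illustration: the function name `satisfies_dp` and the randomized-response example are assumptions introduced here, not part of the paper. It uses the fact that the worst event $S$ is the set of outcomes whose probability under one database exceeds $e^{\varepsilon}$ times its probability under the other.

```python
import math

def satisfies_dp(p, p_prime, eps, delta, tol=1e-12):
    """Check the (eps, delta) inequality, in both directions, for two finite
    output distributions given as dicts mapping outcome -> probability.

    Hypothetical helper for illustration only."""
    def worst_case_excess(a, b):
        # sup over events S of  a(S) - e^eps * b(S); attained by taking S to be
        # the outcomes where a(o) > e^eps * b(o).
        outcomes = set(a) | set(b)
        return sum(max(0.0, a.get(o, 0.0) - math.exp(eps) * b.get(o, 0.0))
                   for o in outcomes)
    return (worst_case_excess(p, p_prime) <= delta + tol
            and worst_case_excess(p_prime, p) <= delta + tol)

# Randomized response on one bit: report the true bit with probability 3/4.
rr_on_0 = {0: 0.75, 1: 0.25}
rr_on_1 = {0: 0.25, 1: 0.75}
pure_ok = satisfies_dp(rr_on_0, rr_on_1, math.log(3), 0.0)   # ratio 3 = e^{ln 3}
too_strict = satisfies_dp(rr_on_0, rr_on_1, 0.5, 0.0)        # e^{0.5} < 3
approx_ok = satisfies_dp(rr_on_0, rr_on_1, 0.5, 0.35)        # slack absorbed by delta
```

The last call shows how a positive $\delta$ absorbs the probability mass that violates the pure $\varepsilon$ bound.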

Differential privacy has a number of desirable properties: it is invariant under post-processing, the privacy 
loss degrades smoothly under (possibly adaptive) composition, and the privacy guarantees hold in the face 
of arbitrary side information. We will adopt it as our definition of choice in this paper. We will work in the 
regime $\delta > 0$, which is often called approximate differential privacy, to distinguish it from pure differential
privacy, which is the case $\delta = 0$. Approximate differential privacy provides strong semantic guarantees when
$\delta$ is sufficiently small: roughly speaking, it implies that with probability at least $1 - O(n\sqrt{\delta})$, an arbitrarily informed
adversary cannot guess from the output of the algorithm if any particular user is represented in the database.
See [GKS08] for a precise formulation of this semantic guarantee.

We then turn to the question of understanding the constraints imposed by privacy on the kinds of 
computation we can perform. We focus on computing answers to a fundamental class of database queries: 
the linear queries , which generalize counting queries. A counting query counts the number of database 
elements that satisfy a given predicate; a linear query allows for weighted counts. Formally, a linear query is 
specified by a function $q: \mathcal{U} \to \mathbb{R}$ ($q: \mathcal{U} \to \{0,1\}$ in the case of counting queries); slightly abusing notation,
we define the value of the query as $q(D) = \sum_{e \in D} q(e)$ (elements of $D$ are counted with multiplicity). We
call a set Q of linear queries a workload , and an algorithm that answers a query workload a mechanism. 

Since the work of Dinur and Nissim [DN03], it has been known that answering queries too accurately
can lead to very dramatic privacy breaches, and this is true even for counting queries. For example,
in [DN03, DMT07] it was shown that answering $\Omega(n)$ random counting queries with error per query $o(\sqrt{n})$
allows an adversary to reconstruct a very accurate representation of a database of size $n$, which contradicts
any reasonable privacy notion. On the other hand, a simple mechanism that adds independent
Gaussian noise to each query answer achieves $(\varepsilon, \delta)$-differential privacy and answers any set $\mathcal{Q}$ of counting
queries with average error $O(\sqrt{|\mathcal{Q}|})$ [DN03, DN04, DMNS06].¹ While this is a useful guarantee for a
small number of queries, it quickly loses value when $|\mathcal{Q}|$ is much larger than the database size, and
becomes trivial for $\omega(n^2)$ queries. Nevertheless, since the seminal paper of Blum, Ligett and Roth [BLR08], a
long line of work [DNR+09, DRV10, RR10, HR10, GHRU11, HLM12, GRU12] has shown that even when
$|\mathcal{Q}| = \omega(n)$, more sophisticated private mechanisms can achieve error not much larger than $O(\sqrt{n})$. For
instance, there exist $(\varepsilon, \delta)$-differentially private mechanisms for linear queries that achieve average error
$O(\sqrt{n} \log^{1/4} |\mathcal{U}|)$ [GRU12]. There are sets of counting queries for which this bound is tight up to factors
polylogarithmic in the size of the database [BUV13].

Specific query workloads allow for error which is much better than the worst-case bounds. Some natural
examples are queries counting the number of points in a line interval or a d-dimensional axis-aligned
box [DNPR10, GSS10, XWG10], or a d-dimensional halfspace [MN12]. It is, therefore, desirable to have
mechanisms whose error bounds adapt both to the query workload and to the database size. In particular,
if $\mathrm{opt}(n, \mathcal{Q})$ is the best possible average error² achievable under differential privacy for the workload
$\mathcal{Q}$ on databases of size at most $n$, we would like to have a mechanism with error at most a small factor
larger than $\mathrm{opt}(n, \mathcal{Q})$ for any $n$ and $\mathcal{Q}$. The first result of this type is due to Nikolov, Talwar, and
Zhang [NTZ13], who presented a mechanism running in time polynomial in $|\mathcal{U}|$, $|\mathcal{Q}|$, and $n$, with error at
most $\mathrm{polylog}(|\mathcal{Q}|, |\mathcal{U}|) \cdot \mathrm{opt}(n, \mathcal{Q})$.

Here we improve the results from [NTZ13]:

Theorem 1.1 (Informal). There exists a mechanism that, given a database of size $n$ drawn from a universe
$\mathcal{U}$, and a workload $\mathcal{Q}$ of linear queries, runs in time polynomial in $|\mathcal{U}|$, $|\mathcal{Q}|$ and $n$, and has average error per
query at most $\mathrm{polylog}(n, |\mathcal{U}|) \cdot \mathrm{opt}(n, \mathcal{Q})$.

Notice that the competitiveness ratio in Theorem 1.1 is independent of the number of queries, which can
be significantly larger than both $n$ and $|\mathcal{U}|$. This type of guarantee is easier to prove when $n = \Omega(|\mathcal{Q}|)$,
because in that case there exist nearly optimal mechanisms that are oblivious of the database size [NTZ13].
Therefore, we focus on the more challenging regime of small databases, i.e. $n = o(|\mathcal{Q}|)$.

It is worth making a couple of remarks about the strength of Theorem 1.1. First, in many applications
the queries in $\mathcal{Q}$ are represented compactly rather than by a truth table, and $|\mathcal{U}|$ is exponentially large in
the size of a natural representation of the input. In such cases, running time bounds which are polynomial
in $|\mathcal{U}|$ may be prohibitive. Nevertheless, our work still gives interesting information theoretic bounds on the
optimal error, and, moreover, our mechanism can be a starting point for developing more efficient variants.
Furthermore, under a plausible complexity theoretic hypothesis, our running time guarantee is the best one
can hope for without making further assumptions on $\mathcal{Q}$ [Ull13]. A second remark is that our optimal error
guarantees are in terms of average error, while many papers in the literature consider worst-case error.
Proving a result analogous to Theorem 1.1 for worst-case error remains an interesting open problem.

Techniques. Following the ideas of [NTZ13], our starting point is a generalization of the well-known
Gaussian noise mechanism, which adds appropriately scaled correlated Gaussian noise to the queries. By
itself, this mechanism is sufficient to guarantee privacy, but its error is too large when $n = o(|\mathcal{Q}|)$. The
main insight of [NTZ13] was to use the knowledge that the database is small to reduce the error via a
post-processing step. The post-processing is a form of regression: we find the vector of answers that is closest to
the noisy answers while still consistent with the database size bound. (In fact the estimator is slightly more
complicated and related to the hybrid estimator of Zhang [Zha13].) Intuitively, when $n$ is small compared
to the number of queries, this regression step cancels a significant fraction of the error.

1 Here and in the remainder of the introduction we ignore dependence of the error on $\varepsilon$ and $\delta$.

2 We give a formal definition later.

Our first novel contribution is to analyze the error of this mechanism for arbitrary noise distributions
and formulate it as a convex function of the covariance matrix of the noise. Then we write a convex program
that captures the problem of finding the covariance matrix for which the performance of the mechanism is
optimized on the given query workload and database size bound. We use Gaussian noise with this optimal
covariance in place of the recursively constructed ad-hoc noise distribution³ from [NTZ13]. Finally, we
relate the dual of the convex program to a spectral lower bound on $\mathrm{opt}(n, \mathcal{Q})$ via the restricted invertibility
principle of Bourgain and Tzafriri [BT87]. We stress that while the restricted invertibility principle was used
in [NTZ13] as well, here we need a new argument which works for the optimal covariance matrix we compute
and gives a smaller competitiveness ratio.

In addition to the improvement in the competitiveness ratio, our approach here is more direct and we 
believe it gives a better understanding of the performance of the regression-based mechanism for small 
databases. 

2 Preliminaries 

We use capital letters for matrices and lower-case letters for vectors and scalars. We use $\langle \cdot, \cdot \rangle$ for the standard
inner product between vectors in $\mathbb{R}^m$. For a matrix $M \in \mathbb{R}^{m \times n}$ and a set $S \subseteq [n]$, we use $M_S$ for the submatrix
consisting of the columns of $M$ indexed by elements of $S$. We use the notation $M \succ 0$ to denote that $M$ is
a positive definite matrix, and $M \succeq 0$ to denote that it is positive semidefinite. We use $\sigma_{\min}(M)$ for the
smallest singular value of $M$, i.e. $\sigma_{\min}(M) = \min_x \|Mx\|_2 / \|x\|_2$. We use $\mathrm{tr}(\cdot)$ for the trace operator, and
$\|M\|_2$ for the $\ell_2 \to \ell_2$ operator norm of $M$, i.e. $\|M\|_2 = \max_x \|Mx\|_2 / \|x\|_2$.

The distribution of a multivariate Gaussian with mean $\mu$ and covariance $\Sigma$ is denoted $N(\mu, \Sigma)$.

2.1 Histograms, the Query Matrix, and the Sensitivity Polytope 

It will be convenient to encode the problem of releasing answers to linear queries using linear-algebraic 
notation. A common and very useful representation of a database $D$ is the histogram representation: the
histogram of $D$ is a vector $x \in \mathbb{R}^{\mathcal{U}}$ such that for any $e \in \mathcal{U}$, $x_e$ is equal to the number of copies of $e$ in $D$.
Notice that $\|x\|_1 = n$ and also that if $x$ and $x'$ are respectively the histograms of two neighboring databases
$D$ and $D'$, then $\|x - x'\|_1 \le 1$ (here $\|x\|_1 = \sum_e |x_e|$ is the standard $\ell_1$ norm). Linear queries are a linear
transformation of $x$. More concretely, let us define the query matrix $A \in \mathbb{R}^{\mathcal{Q} \times \mathcal{U}}$ associated with a set of
linear queries $\mathcal{Q}$ by $a_{q,e} = q(e)$. Then it is easy to see that the vector $Ax$ gives the answers to the queries $\mathcal{Q}$
on a database $D$ with histogram $x$.
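
The histogram encoding can be made concrete with a toy example. The following is a minimal sketch; the three-element universe, the database and the queries are made up for illustration, and the helper names are not from the paper.

```python
from collections import Counter

def histogram(database, universe):
    """Histogram x of a multiset: x[e] = number of copies of e in the database."""
    counts = Counter(database)
    return [counts[e] for e in universe]

def query_matrix(queries, universe):
    """Query matrix A with entries a[q][e] = q(e)."""
    return [[q(e) for e in universe] for q in queries]

def answer(A, x):
    """Matrix-vector product A x: the exact answers to all queries."""
    return [sum(a * xe for a, xe in zip(row, x)) for row in A]

universe = ["a", "b", "c"]
database = ["a", "a", "c"]                      # a multiset of n = 3 elements
queries = [
    lambda e: 1 if e in ("a", "b") else 0,      # counting query
    lambda e: 1 if e == "c" else 0,             # counting query
    lambda e: 0.5,                              # weighted (linear) query
]

x = histogram(database, universe)               # [2, 0, 1]
A = query_matrix(queries, universe)
answers = answer(A, x)                          # [2, 1, 1.5]

# Removing one individual changes the histogram by at most 1 in l1 norm.
x_neighbor = histogram(["a", "c"], universe)
l1_distance = sum(abs(u - v) for u, v in zip(x, x_neighbor))
```

Each entry of `answers` equals $q(D) = \sum_{e \in D} q(e)$ for the corresponding query, computed through the histogram instead of the raw multiset.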

Since this does not lead to any loss in generality, for the remainder of this paper we will assume that
databases are given to mechanisms as histograms, and workloads of linear queries are given as query matrices.
We will identify the space of size-$n$ databases with histograms in the scaled $\ell_1$ ball $nB_1^{\mathcal{U}} = \{x \in \mathbb{R}^{\mathcal{U}} : \|x\|_1 \le n\}$,
and we will identify neighboring databases with histograms $x, x'$ such that $\|x - x'\|_1 \le 1$.

The sensitivity polytope $K_A$ of a query matrix $A \in \mathbb{R}^{\mathcal{Q} \times \mathcal{U}}$ is the convex hull of the columns of $A$ and the
columns of $-A$. Equivalently, $K_A = AB_1^{\mathcal{U}}$, i.e. the image of the unit $\ell_1$ ball in $\mathbb{R}^{\mathcal{U}}$ under multiplication by $A$.
Notice that $nK_A = \{Ax : \|x\|_1 \le n\}$ is the symmetric convex hull⁴ of the possible vectors of query answers
to the queries in $\mathcal{Q}$ on databases of size at most $n$.

2.2 Measures of Error and the Spectral Lower Bound 

As our basic notion of error we will consider mean squared error. For a mechanism $\mathcal{M}$ and a subset $X \subseteq \mathbb{R}^{\mathcal{U}}$,
let us define the error with respect to the query matrix $A \in \mathbb{R}^{\mathcal{Q} \times \mathcal{U}}$ as

$$\mathrm{err}(\mathcal{M}, X, A) = \sup_{x \in X} \left( \mathbb{E}\, \frac{1}{|\mathcal{Q}|} \|Ax - \mathcal{M}(A, x)\|_2^2 \right)^{1/2},$$

where the expectation is taken over the random coins of $\mathcal{M}$. We also write $\mathrm{err}(\mathcal{M}, nB_1^{\mathcal{U}}, A)$ as $\mathrm{err}(\mathcal{M}, n, A)$.
The optimal error achievable by any $(\varepsilon, \delta)$-differentially private mechanism for the query matrix $A$ and
databases of size up to $n$ is

$$\mathrm{opt}_{\varepsilon,\delta}(n, A) = \inf_{\mathcal{M}} \mathrm{err}(\mathcal{M}, n, A),$$

where the infimum is taken over all $(\varepsilon, \delta)$-differentially private mechanisms $\mathcal{M}$.

3 The distribution in [NTZ13] is independent of the database size bound. This could be a reason why their guarantees scale
with $\log |\mathcal{Q}|$ rather than $\log n$.

4 The symmetric convex hull of a set of points $v_1, \ldots, v_N$ is equal to the convex hull of $\pm v_1, \ldots, \pm v_N$.

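Arguing directly about $\mathrm{opt}_{\varepsilon,\delta}(n, A)$ appears difficult. For this reason we use the following spectral lower
bound from [NTZ13]. This lower bound was implicit in previous papers, for example [KRSU10].

Theorem 2.1 ([NTZ13]). There exists a constant $c$ such that for any query matrix $A \in \mathbb{R}^{\mathcal{Q} \times \mathcal{U}}$, any small
enough $\varepsilon$, and any $\delta$ small enough with respect to $\varepsilon$, $\mathrm{opt}_{\varepsilon,\delta}(n, A) \ge (c/\varepsilon)\, \mathrm{SpecLB}(\varepsilon n, A)$, where

$$\mathrm{SpecLB}(k, A) = \max_{S \subseteq \mathcal{U} :\, |S| \le k} \sqrt{k/|\mathcal{Q}|}\; \sigma_{\min}(A_S).$$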


2.3 Composition and the Gaussian Mechanism 

An important basic property of differential privacy is that the privacy guarantees degrade smoothly under 
composition and are not affected by post-processing. 

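Lemma 2.1.1 ([DMNS06, DKM+06]). Let $\mathcal{M}_1(\cdot)$ satisfy $(\varepsilon_1, \delta_1)$-differential privacy, and $\mathcal{M}_2(x, \cdot)$ satisfy
$(\varepsilon_2, \delta_2)$-differential privacy for any fixed $x$. Then the mechanism $\mathcal{M}_2(\mathcal{M}_1(D), D)$ satisfies $(\varepsilon_1 + \varepsilon_2, \delta_1 + \delta_2)$-
differential privacy.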
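
The bookkeeping in Lemma 2.1.1 is additive in both parameters. A minimal sketch follows; the `compose` helper is hypothetical and only tracks the parameters, it does not run any mechanism.

```python
def compose(budget_1, budget_2):
    """Privacy parameters of M2(M1(D), D) when M1 is (eps1, delta1)-DP and
    M2(x, .) is (eps2, delta2)-DP for every fixed x (Lemma 2.1.1)."""
    (eps1, delta1), (eps2, delta2) = budget_1, budget_2
    return (eps1 + eps2, delta1 + delta2)

total = compose((0.5, 0.25), (0.25, 0.125))   # (0.75, 0.375)
```

Post-processing, in contrast, costs nothing: a function applied to the output of a private mechanism without further access to the database keeps the same $(\varepsilon, \delta)$.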


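A basic method to achieve $(\varepsilon, \delta)$-differential privacy is the Gaussian mechanism. We use the following
generalized variant, introduced in [NTZ13].


Theorem 2.2 ([DN03, DN04, DMNS06, NTZ13]). Let $\mathcal{Q}$ be a set of queries with query matrix $A$, and let
$\Sigma \in \mathbb{R}^{\mathcal{Q} \times \mathcal{Q}}$, $\Sigma \succ 0$, be such that $a_e^T \Sigma^{-1} a_e \le 1$ for all columns $a_e$ of $A$. Then the mechanism $\mathcal{M}_\Sigma(A, x) =
Ax + w$, where $w \sim N(0, c_{\varepsilon,\delta}^2 \Sigma)$ and $c_{\varepsilon,\delta}$ is a scaling factor depending only on $\varepsilon$ and $\delta$, satisfies $(\varepsilon, \delta)$-differential privacy.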
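
The generalized Gaussian mechanism can be sketched in a few lines of dependency-free Python. This is an illustration under stated assumptions, not the paper's implementation: the noise scale used in the example is the classical Gaussian-mechanism calibration $\sqrt{2\ln(1.25/\delta)}/\varepsilon$ (valid for $\varepsilon < 1$), which is an assumption made here and may differ from the constant $c_{\varepsilon,\delta}$ of Theorem 2.2. All function names are hypothetical.

```python
import math
import random

def cholesky(S):
    """Lower-triangular L with L L^T = S, for symmetric positive definite S."""
    m = len(S)
    L = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1):
            s = S[i][j] - sum(L[i][t] * L[j][t] for t in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def sigma_inv_norm_sq(L, a):
    """a^T Sigma^{-1} a = ||L^{-1} a||_2^2, computed by forward substitution."""
    z = []
    for i in range(len(a)):
        z.append((a[i] - sum(L[i][t] * z[t] for t in range(i))) / L[i][i])
    return sum(v * v for v in z)

def classical_noise_scale(eps, delta):
    """ASSUMPTION: classical calibration, not necessarily the paper's c_{eps,delta}."""
    return math.sqrt(2.0 * math.log(1.25 / delta)) / eps

def generalized_gaussian_mechanism(A, x, Sigma, c, rng):
    """Return A x + w with w ~ N(0, c^2 Sigma), after checking a_e^T Sigma^{-1} a_e <= 1."""
    m, size = len(A), len(x)
    L = cholesky(Sigma)
    columns = [[A[q][e] for q in range(m)] for e in range(size)]
    if any(sigma_inv_norm_sq(L, a) > 1 + 1e-9 for a in columns):
        raise ValueError("some column violates a^T Sigma^{-1} a <= 1")
    g = [rng.gauss(0.0, 1.0) for _ in range(m)]
    # w = c * L g has covariance c^2 L L^T = c^2 Sigma.
    w = [c * sum(L[i][t] * g[t] for t in range(i + 1)) for i in range(m)]
    y = [sum(A[q][e] * x[e] for e in range(size)) for q in range(m)]
    return [yi + wi for yi, wi in zip(y, w)]

A = [[1, 0, 1], [0, 1, 1]]
x = [2, 0, 1]
Sigma = [[2.0, 0.0], [0.0, 2.0]]   # every column a has a^T Sigma^{-1} a <= 1
noisy = generalized_gaussian_mechanism(
    A, x, Sigma, classical_noise_scale(0.5, 1e-6), random.Random(0))
```

Since $\mathbb{E}\|w\|_2^2 = c_{\varepsilon,\delta}^2\, \mathrm{tr}(\Sigma)$, the error of this mechanism in the sense of Section 2.2 is $c_{\varepsilon,\delta} \sqrt{\mathrm{tr}(\Sigma)/|\mathcal{Q}|}$, independently of the database.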


3 The Projection Mechanism 

A key element in our mechanism is the use of least squares estimation to reduce error on small databases. In 
this section we introduce and analyze a mechanism based on least squares estimation, similar to the hybrid 
estimator of [Zha13]. Essentially the same mechanism was used in [NTZ13], but the definition and analysis
were tied to a particular noise distribution.


Algorithm 1 Projection Mechanism $\mathcal{M}_\Sigma^{\mathrm{proj}}$

Input: (Public) Query matrix $A \in \mathbb{R}^{\mathcal{Q} \times \mathcal{U}}$; matrix $\Sigma \succ 0$ such that $a_e^T \Sigma^{-1} a_e \le 1$ for all columns $a_e$ of $A$.
Input: (Private) Histogram $x$ of a database of size $\|x\|_1 \le n$.

1: Run the generalized Gaussian mechanism (Theorem 2.2) to compute $\tilde{y} = \mathcal{M}_\Sigma(A, x)$;

2: Let $\Pi$ be the orthogonal projection operator onto the span of the eigenvectors corresponding to the
$k = \lfloor \varepsilon n \rfloor$ largest eigenvalues of $\Sigma$;

3: Compute $\hat{y} \in n(I - \Pi)K_A$, where $K_A$ is the sensitivity polytope of $A$, and $\hat{y}$ is

$$\hat{y} = \arg\min\{\|z - (I - \Pi)\tilde{y}\|_2^2 : z \in n(I - \Pi)K_A\}.$$

Output: Vector of answers $\Pi\tilde{y} + \hat{y}$.


As shown in [NTZ13, DNT14], Algorithm 1 can be efficiently implemented using the ellipsoid algorithm
or the Frank-Wolfe algorithm.
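
To illustrate the least squares step, here is a minimal Frank-Wolfe sketch for projecting a point onto a scaled symmetric polytope $n \cdot \mathrm{conv}\{\pm a_e\}$. It is a generic textbook variant with exact line search, not the implementation of [NTZ13, DNT14]; the function name and the toy inputs are assumptions. In Algorithm 1 the vectors passed as `columns` would be the columns of $(I - \Pi)A$ and the target would be $(I - \Pi)\tilde{y}$.

```python
def project_onto_scaled_polytope(columns, n, target, iterations=5000):
    """Approximate argmin ||z - target||_2^2 over z in n * conv{+a, -a : a in columns}."""
    vertices = [[s * n * v for v in col] for col in columns for s in (1.0, -1.0)]
    z = [0.0] * len(target)               # the origin lies in the symmetric polytope
    for _ in range(iterations):
        grad = [zi - ti for zi, ti in zip(z, target)]   # gradient of 0.5*||z - target||^2
        # Linear minimization oracle: a linear function is minimized at a vertex.
        v = min(vertices, key=lambda u: sum(g * ui for g, ui in zip(grad, u)))
        d = [vi - zi for vi, zi in zip(v, z)]
        dd = sum(di * di for di in d)
        if dd == 0.0:
            break                          # already at the minimizing vertex
        step = -sum(g * di for g, di in zip(grad, d)) / dd   # exact line search
        step = max(0.0, min(1.0, step))
        if step == 0.0:
            break                          # no descent direction left: z is optimal
        z = [zi + step * di for zi, di in zip(z, d)]
    return z

# Toy instance: columns e_1, e_2 and n = 1, so the polytope is the unit l1 ball.
outside = project_onto_scaled_polytope([[1, 0], [0, 1]], 1, [2.0, 0.0])   # [1.0, 0.0]
inside = project_onto_scaled_polytope([[1, 0], [0, 1]], 1, [0.3, 0.2])
```

Every iterate is a convex combination of vertices, so the output always lies in the polytope; only its distance to the true projection depends on the iteration count.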

To analyze the error of the Projection Mechanism, we use the following key lemma, which appears
to be standard in statistics (we refer to [NTZ13, DNT14] for a proof). Recall that for a convex body
(compact convex set with non-empty interior) $L \subset \mathbb{R}^m$, the Minkowski norm (gauge function) is defined
by $\|x\|_L = \min\{r : x \in rL\}$ for any $x \in \mathbb{R}^m$. The polar body is $L^\circ = \{y : \langle y, x \rangle \le 1\ \forall x \in L\}$ and the
corresponding norm is also equal to the support function of $L$: $\|y\|_{L^\circ} = \max\{\langle y, x \rangle : x \in L\}$. When $L$ is
symmetric around 0 (i.e. $-L = L$), the Minkowski norm and support function are both norms in the usual
sense.

Lemma 3.0.1 ([NTZ13, DNT14]). Let $L \subset \mathbb{R}^m$ be a symmetric convex body, and let $y \in L$, $\tilde{y} \in \mathbb{R}^m$. Let,
finally, $\hat{y} = \arg\min\{\|z - \tilde{y}\|_2^2 : z \in L\}$. We have $\|\hat{y} - y\|_2^2 \le 4\min\{\|\tilde{y} - y\|_2^2, \|\tilde{y} - y\|_{L^\circ}\}$.

The next lemma gives our analysis of the error of the Projection Mechanism. 

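Lemma 3.0.2. Assume $\Sigma \succ 0$ is such that $a_e^T \Sigma^{-1} a_e \le 1$ for all columns $a_e$ of $A$. Then the Projection
Mechanism $\mathcal{M}_\Sigma^{\mathrm{proj}}$ in Algorithm 1 is $(\varepsilon, \delta)$-differentially private. Moreover, for $\varepsilon = O(1)$,

$$\mathrm{err}(\mathcal{M}_\Sigma^{\mathrm{proj}}, n, A) = O\left( \left(1 + \frac{\sqrt{\log |\mathcal{U}|}}{\sqrt{\log 1/\delta}}\right)^{1/2} c_{\varepsilon,\delta} \left( \frac{1}{|\mathcal{Q}|} \sum_{i=1}^{k} \sigma_i \right)^{1/2} \right),$$

where $\sigma_1 \ge \sigma_2 \ge \ldots \ge \sigma_{|\mathcal{Q}|}$ are the eigenvalues of $\Sigma$ and $k = \lfloor \varepsilon n \rfloor$.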

Proof. To prove the privacy guarantee, observe that the output of $\mathcal{M}_\Sigma^{\mathrm{proj}}(A, x)$ is just a post-processing of
the output of $\mathcal{M}_\Sigma(A, x)$, i.e. the algorithm does not access $x$ except to pass it to $\mathcal{M}_\Sigma(A, x)$. The privacy
then follows from Theorem 2.2 and Lemma 2.1.1.

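Next we bound the error. Let $y = Ax$ be the true answers to the queries, and let $w = \tilde{y} - y \sim N(0, c_{\varepsilon,\delta}^2 \Sigma)$
be the random noise introduced by the generalized Gaussian mechanism. By the Pythagorean theorem and
linearity of expectation, the expected total squared error of the projection mechanism is

$$\mathbb{E}\|\Pi\tilde{y} + \hat{y} - y\|_2^2 = \mathbb{E}\|\Pi\tilde{y} - \Pi y\|_2^2 + \mathbb{E}\|\hat{y} - (I - \Pi)y\|_2^2.$$

Above and in the remainder of the proof expectations are taken with respect to the randomness in the choice
of $w$. We bound the two terms on the right hand side separately. We will show:

$$\mathbb{E}\|\Pi\tilde{y} - \Pi y\|_2^2 \le c_{\varepsilon,\delta}^2 \sum_{i=1}^{k} \sigma_i, \qquad (1)$$

$$\mathbb{E}\|\hat{y} - (I - \Pi)y\|_2^2 = O\left( \frac{\sqrt{\log |\mathcal{U}|}}{\sqrt{\log 1/\delta}} \right) c_{\varepsilon,\delta}^2 \sum_{i=1}^{k} \sigma_i. \qquad (2)$$

(1) and (2) together imply the error bound in the lemma.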

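To prove (1), observe that $\Pi\tilde{y} - \Pi y = \Pi w \sim N(0, c_{\varepsilon,\delta}^2 \Pi\Sigma\Pi)$. By the definition of $\Pi$, the non-zero
eigenvalues of $\Pi\Sigma\Pi$ are $\sigma_1, \ldots, \sigma_k$, where $k = \lfloor \varepsilon n \rfloor$. We have

$$\mathbb{E}\|\Pi\tilde{y} - \Pi y\|_2^2 = c_{\varepsilon,\delta}^2\, \mathrm{tr}(\Pi\Sigma\Pi) = c_{\varepsilon,\delta}^2 \sum_{i=1}^{k} \sigma_i.$$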

To prove (2) we appeal to Lemma 3.0.1. Define $K = (I - \Pi)K_A$. With $nK$ in the place of $L$, the lemma
implies that

$$\mathbb{E}\|\hat{y} - (I - \Pi)y\|_2^2 \le 4\,\mathbb{E}\|(I - \Pi)w\|_{(nK)^\circ} = 4n\,\mathbb{E}\|(I - \Pi)w\|_{K^\circ}, \qquad (3)$$

where we used the simple fact

$$\|(I - \Pi)w\|_{(nK)^\circ} = \sup_{z \in nK} \langle (I - \Pi)w, z \rangle = n \sup_{z \in K} \langle (I - \Pi)w, z \rangle = n\|(I - \Pi)w\|_{K^\circ}.$$

$K$ is the convex hull of the columns of $(I - \Pi)A$ and the columns of $-(I - \Pi)A$. For any such column
$(I - \Pi)a_e$ we have

$$1 \ge a_e^T \Sigma^{-1} a_e \ge a_e^T (I - \Pi)\Sigma^{-1}(I - \Pi) a_e \ge \sigma_{k+1}^{-1}\, a_e^T (I - \Pi) a_e.$$

The first inequality is by the assumption on $\Sigma$; the second follows because $\Sigma^{-1} - (I - \Pi)\Sigma^{-1}(I - \Pi) \succeq 0$;
the third inequality is due to the fact that the smallest eigenvalue of $(I - \Pi)\Sigma^{-1}(I - \Pi)$ restricted to the
range of $I - \Pi$ is $\sigma_{k+1}^{-1}$, by the choice of $\Pi$. Therefore, $\|(I - \Pi)a_e\|_2^2 \le \sigma_{k+1} \le \sigma_k$. Since a linear functional
attains its maximum value over a polytope at a vertex, we have $\|(I - \Pi)w\|_{K^\circ} = \sup_{z \in K} \langle (I - \Pi)w, z \rangle =
\max_{e \in \mathcal{U}} |\langle (I - \Pi)w, a_e \rangle|$. Each inner product $\langle (I - \Pi)w, a_e \rangle$ is a centered Gaussian random variable with
variance $\mathbb{E}\langle (I - \Pi)w, a_e \rangle^2 = c_{\varepsilon,\delta}^2\, a_e^T (I - \Pi)\Sigma(I - \Pi) a_e$. By the choice of $\Pi$, the largest eigenvalue of
$(I - \Pi)\Sigma(I - \Pi)$ is $\sigma_{k+1} \le \sigma_k$. From this fact and the inequality $\|(I - \Pi)a_e\|_2^2 \le \sigma_k$, we have that the variance
of $\langle (I - \Pi)w, a_e \rangle$ is at most $c_{\varepsilon,\delta}^2 \sigma_k^2$. By a standard concentration argument, we can bound the expectation of
the maximum absolute value of the inner products as

$$\mathbb{E}\|(I - \Pi)w\|_{K^\circ} = \mathbb{E}\max_{e \in \mathcal{U}} |\langle (I - \Pi)w, a_e \rangle| = O(\sqrt{\log |\mathcal{U}|})\, c_{\varepsilon,\delta}\, \sigma_k.$$

Plugging this into (3), we get

$$\mathbb{E}\|\hat{y} - (I - \Pi)y\|_2^2 = O(\sqrt{\log |\mathcal{U}|})\, c_{\varepsilon,\delta}\, n\, \sigma_k.$$

To show that this implies (2), observe that, by averaging, $c_{\varepsilon,\delta}\, n\, \sigma_k \le \frac{c_{\varepsilon,\delta}\, n}{k} \sum_{i=1}^{k} \sigma_i$. Since $k = \lfloor \varepsilon n \rfloor$,
$\frac{n}{k} = O\left( \frac{c_{\varepsilon,\delta}}{\sqrt{\log 1/\delta}} \right)$. This finishes the proof of (2), and, therefore, of the lemma. □

4 Optimality of the Projection Mechanism 

In this section we show that we can choose a covariance matrix $\Sigma$ so that $\mathcal{M}_\Sigma^{\mathrm{proj}}$ has nearly optimal error:

Theorem 4.1. Let $\varepsilon$ be a small enough constant and let $\delta$ be small enough with respect to $\varepsilon$. For
any query matrix $A \in \mathbb{R}^{\mathcal{Q} \times \mathcal{U}}$, and any database size bound $n$, there exists a covariance matrix $\Sigma \succ 0$ such
that the Projection Mechanism $\mathcal{M}_\Sigma^{\mathrm{proj}}$ in Algorithm 1 is $(\varepsilon, \delta)$-differentially private and has error

$$\mathrm{err}(\mathcal{M}_\Sigma^{\mathrm{proj}}, n, A) = O((\log n)(\log 1/\delta)^{1/4}(\log |\mathcal{U}|)^{1/4}) \cdot \frac{1}{\varepsilon}\, \mathrm{SpecLB}(\varepsilon n, A)$$

$$= O((\log n)(\log 1/\delta)^{1/4}(\log |\mathcal{U}|)^{1/4}) \cdot \mathrm{opt}_{\varepsilon,\delta}(n, A).$$

Moreover, $\Sigma$ can be computed in time polynomial in $|\mathcal{Q}|$.

Theorem 4.1 is the formal statement of Theorem 1.1. (Recall again that Algorithm 1 can be implemented
in time polynomial in $n$, $|\mathcal{Q}|$ and $|\mathcal{U}|$, as shown in [NTZ13, DNT14].)

To prove the theorem, we optimize over the choices of $\Sigma$ that ensure $(\varepsilon, \delta)$-differential privacy, and use
convex duality and the restricted invertibility principle to relate the optimal covariance to the spectral lower
bound.

4.1 Minimizing the Ky Fan Norm 

Recall that for an $m \times m$ matrix $\Sigma \succ 0$ with eigenvalues $\sigma_1 \ge \ldots \ge \sigma_m$, and a positive integer $k \le m$,
the Ky Fan $k$-norm is defined as $\|\Sigma\|_{(k)} = \sigma_1 + \ldots + \sigma_k$. The covariance matrix $\Sigma$ we use in the projection
mechanism will be the one achieving $\min\{\|\Sigma\|_{(k)} : a_e^T \Sigma^{-1} a_e \le 1\ \forall e \in \mathcal{U}\}$, where $a_e$ is the column of the
query matrix $A$ associated with the universe element $e$. This choice is directly motivated by Lemma 3.0.2.
We can write this optimization problem in the following way.

$$\text{Minimize } \|X^{-1}\|_{(k)} \text{ s.t.} \qquad (4)$$

$$X \succ 0 \qquad (5)$$

$$\forall e \in \mathcal{U} : a_e^T X a_e \le 1. \qquad (6)$$

The program above has a geometric meaning. For a positive definite matrix $X$, the set $E(X) = \{v \in \mathbb{R}^m :
v^T X v \le 1\}$ is an ellipsoid centered at the origin. The constraint (6) means that $E(X)$ has to contain all columns
of the query matrix $A$. The objective function (4) is equal to the sum of squared lengths of the $k$ longest
major axes of $E(X)$. Therefore, we are looking for the smallest ellipsoid centered at the origin that contains
the columns of $A$, where the "size" of the ellipsoid is the sum of squared lengths of the $k$ longest major axes.
We will not use this geometric interpretation in the rest of the paper.
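
The program (4)-(6) is easy to evaluate when $X$ is restricted to diagonal matrices, which is enough to see the trade-off it captures. The sketch below makes that simplifying assumption (a general $X$ would need an eigenvalue routine); the helper names and the two-column example are made up for illustration.

```python
def constraint_values(columns, x_diag):
    """Values a_e^T X a_e for X = diag(x_diag); constraint (6) asks each to be <= 1."""
    return [sum(xi * ai * ai for xi, ai in zip(x_diag, a)) for a in columns]

def ky_fan_of_inverse(x_diag, k):
    """Objective (4) for diagonal X: the sum of the k largest eigenvalues 1/x_i of X^{-1}."""
    inverse_eigenvalues = sorted((1.0 / xi for xi in x_diag), reverse=True)
    return sum(inverse_eigenvalues[:k])

columns = [(1.0, 0.0), (0.0, 2.0)]       # columns a_e of a toy query matrix

# A ball of radius r = 2 (the longest column norm): X = r^{-2} I.
ball = [0.25, 0.25]
ball_values = constraint_values(columns, ball)        # [0.25, 1.0]
ball_objective = ky_fan_of_inverse(ball, 2)           # 8.0

# An axis-aligned ellipsoid that hugs both columns does better for k = 2.
ellipsoid = [1.0, 0.25]
ellipsoid_values = constraint_values(columns, ellipsoid)   # [1.0, 1.0]
ellipsoid_objective = ky_fan_of_inverse(ellipsoid, 2)      # 5.0
```

Both choices are feasible, but the second ellipsoid shrinks the axis along which the columns are short and so lowers the objective.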

We will show that (4)-(6) is a convex optimization problem. This will allow us to use general tools such
as the ellipsoid method to find an optimal solution, and also to use duality theory in order to analyze the
value of the optimal solution.

To show that (4)-(6) is convex we will need the following well-known result of Fan.

Lemma 4.1.1 ([Fan49]). For any $m \times m$ real symmetric matrix $\Sigma$,

$$\|\Sigma\|_{(k)} = \max_{U \in \mathbb{R}^{m \times k} :\, U^T U = I} \mathrm{tr}(U^T \Sigma U).$$

With this result in hand, we can prove that (4)-(6) is a convex optimization problem.

Lemma 4.1.2. The objective function (4) and constraints (6) are convex over $X \succ 0$.

Proof. The constraints (6) are affine, and therefore convex. It remains to show that
the objective (4) is also convex. Let $X_1$ and $X_2$ be two feasible solutions and define $Y = \alpha X_1 + (1 - \alpha)X_2$ for
some $\alpha \in [0, 1]$. Because the matrix inverse is operator convex (see e.g. [Bha97]), $Y^{-1} \preceq \alpha X_1^{-1} + (1 - \alpha)X_2^{-1}$.
Let $U \in \mathbb{R}^{m \times k}$ be such that $\mathrm{tr}(U^T Y^{-1} U) = \|Y^{-1}\|_{(k)}$ and $U^T U = I$. Such a $U$ exists by Lemma 4.1.1.
We have, again using Lemma 4.1.1,

$$\|Y^{-1}\|_{(k)} = \mathrm{tr}(U^T Y^{-1} U) \le \alpha\, \mathrm{tr}(U^T X_1^{-1} U) + (1 - \alpha)\, \mathrm{tr}(U^T X_2^{-1} U)$$

$$\le \alpha \|X_1^{-1}\|_{(k)} + (1 - \alpha)\|X_2^{-1}\|_{(k)}.$$


This finishes the proof. □

Since the program (4)-(6) is convex, its optimal solution can be approximated in polynomial time within
any given degree of accuracy using the ellipsoid algorithm [GLS81].


4.2 A Special Function 

Before we continue, we need to introduce a somewhat complicated function of the singular values of a matrix.
This function will turn out to be the objective function in a maximization problem which is dual to (4)-(6).
The next lemma is needed to argue that this function is well-defined. The lemma was proved in [Nik15].

Lemma 4.1.3 ([Nik15]). Let $\sigma_1 \ge \ldots \ge \sigma_m \ge 0$ be non-negative reals, and let $k \le m$ be a positive integer.
There exists a unique integer $t$, $0 \le t \le k - 1$, such that

$$\sigma_t > \frac{\sum_{i > t} \sigma_i}{k - t} \ge \sigma_{t+1}, \qquad (7)$$

with the convention $\sigma_0 = \infty$.

We are now ready to define the function: 

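Definition 4.1. Let $\Sigma \succeq 0$ be an $m \times m$ positive semidefinite matrix with singular values $\sigma_1 \ge \ldots \ge \sigma_m$,
and let $k \le m$ be a positive integer. The function $h_k(\Sigma)$ is defined as

$$h_k(\Sigma) = \sum_{i=1}^{t} \sigma_i^{1/2} + \sqrt{k - t} \left( \sum_{i > t} \sigma_i \right)^{1/2},$$

where $t$ is the unique integer such that $\sigma_t > \frac{\sum_{i > t} \sigma_i}{k - t} \ge \sigma_{t+1}$.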
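
Since $h_k$ depends only on the singular values, it can be computed from a sorted list of them. The following sketch (function names are illustrative) searches for the index $t$ of Lemma 4.1.3 and then evaluates Definition 4.1; the list `sigma` is 0-indexed, so $\sigma_i$ is `sigma[i - 1]`.

```python
import math

def find_t(sigma, k):
    """Return the unique t in {0, ..., k-1} with
    sigma_t > (sum_{i>t} sigma_i) / (k - t) >= sigma_{t+1}, where sigma_0 = infinity.
    `sigma` must be sorted in non-increasing order with non-negative entries."""
    for t in range(k):
        tail_average = sum(sigma[t:]) / (k - t)
        sigma_t = math.inf if t == 0 else sigma[t - 1]
        if sigma_t > tail_average >= sigma[t]:
            return t
    raise ValueError("no valid t found; is sigma sorted and non-negative?")

def h_k(sigma, k):
    """Evaluate h_k from the singular values, following Definition 4.1."""
    t = find_t(sigma, k)
    head = sum(math.sqrt(s) for s in sigma[:t])
    return head + math.sqrt(k - t) * math.sqrt(sum(sigma[t:]))

t_example = find_t([4, 1, 1, 1], 2)      # 1, since 4 > (1 + 1 + 1) / 1 >= 1
h_example = h_k([4, 1, 1, 1], 2)         # sqrt(4) + sqrt(1) * sqrt(3)
h_flat = h_k([1, 1, 1, 1], 4)            # t = 0: sqrt(4) * sqrt(4) = 4
```

Two sanity checks follow from the definition: when $k = m$ and all singular values are equal, $h_k$ is the sum of their square roots, and scaling all singular values by $c$ scales $h_k$ by $\sqrt{c}$.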

Lemma 4.1.3 guarantees that $h_k(\Sigma)$ is a well-defined real-valued function. In the next lemma we also
show that it is continuous.

Lemma 4.1.4. The function $h_k$ is continuous over positive semidefinite matrices with respect to the operator
norm.

Proof. Let $\Sigma$ be an $m \times m$ positive semidefinite matrix with singular values $\sigma_1 \ge \ldots \ge \sigma_m$ and let $t$, $0 \le t < k$,
be the unique integer for which $\sigma_t > \frac{\sum_{i > t} \sigma_i}{k - t} \ge \sigma_{t+1}$. If $\frac{\sum_{i > t} \sigma_i}{k - t} > \sigma_{t+1}$, then setting $\delta$ small enough ensures
that, for any $\Sigma'$ such that $\|\Sigma - \Sigma'\|_2 < \delta$, $h_k(\Sigma)$ and $h_k(\Sigma')$ are computed with the same value of $t$. In this
case, the proof of continuity follows from the continuity of the square root function. Let us therefore assume
that $\frac{\sum_{i > t} \sigma_i}{k - t} = \sigma_{t+1} = \ldots = \sigma_{t'} > \sigma_{t'+1}$ for some $t' \ge t + 1$. Then for any integer $s \in [t, t']$,

$$\sum_{i > s} \sigma_i = \sum_{i > t} \sigma_i - (s - t)\sigma_{t+1} = (k - s)\sigma_{t+1}.$$

We then have

$$\sum_{i=1}^{t} \sigma_i^{1/2} + \sqrt{k - t} \left( \sum_{i > t} \sigma_i \right)^{1/2} = \sum_{i=1}^{t} \sigma_i^{1/2} + (k - t)\sigma_{t+1}^{1/2}$$

$$= \sum_{i=1}^{s} \sigma_i^{1/2} + (k - s)\sigma_{t+1}^{1/2}$$

$$= \sum_{i=1}^{s} \sigma_i^{1/2} + \sqrt{k - s} \left( \sum_{i > s} \sigma_i \right)^{1/2}. \qquad (8)$$

For any $\Sigma'$ such that $\|\Sigma' - \Sigma\|_2 < \delta$ for a small enough $\delta$, we have

$$h_k(\Sigma') = \sum_{i=1}^{s} \sigma_i(\Sigma')^{1/2} + \sqrt{k - s} \left( \sum_{i > s} \sigma_i(\Sigma') \right)^{1/2},$$

for some integer $s$ in $[t, t']$. Continuity then follows from (8), and the continuity of the square root function.

□


4.3 The Dual of the Ky Fan Norm Minimization Problem 

Our next goal is to derive a dual characterization of (4)-(6), which we will then relate to the spectral lower
bound $\mathrm{SpecLB}(k, A)$. It is useful to work with the dual, because it is a maximization problem, so to prove
optimality we just need to show that any feasible solution of the dual gives a lower bound on the optimal
error under differential privacy.

The next theorem gives our dual characterization in terms of the special function $h_k$ defined in the
previous section.

Theorem 4.2. Let $A = (a_e)_{e \in \mathcal{U}} \in \mathbb{R}^{\mathcal{Q} \times \mathcal{U}}$ be a rank $|\mathcal{Q}|$ matrix, and let $\mu^2$ be the optimal value of (4)-(6).
Then,

$$\mu^2 = \max\ h_k(AQA^T)^2 \quad \text{s.t.} \qquad (9)$$

$$Q \succeq 0,\ \text{diagonal},\ \mathrm{tr}(Q) = 1. \qquad (10)$$

Since the objective of (4)-(6) is not necessarily differentiable, in order to analyze the dual and prove
Theorem 4.2 we need to recall the concepts of subgradients and subdifferentials. A subgradient of a convex
function $f: S \to \mathbb{R}$ at $x \in S$, where $S$ is some open subset of $\mathbb{R}^d$, is a vector $y \in \mathbb{R}^d$ so that for every $z \in S$
we have

$$f(z) \ge f(x) + \langle z - x, y \rangle.$$

The set of subgradients of $f$ at $x$ is denoted $\partial f(x)$ and is known as the subdifferential. When $f$ is differentiable
at $x$, the subdifferential is a singleton set containing only the gradient $\nabla f(x)$. If $f$ is defined by $f(x) =
f_1(x) + f_2(x)$, where $f_1, f_2 : S \to \mathbb{R}$, then $\partial f(x) = \partial f_1(x) + \partial f_2(x)$. A basic fact in convex analysis is
that $f$ achieves its minimum at $x$ if and only if $0 \in \partial f(x)$. For more information on subgradients and
subdifferentials, see the classical text of Rockafellar [Roc70].
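
As a one-dimensional illustration of these definitions, consider $f(x) = |x|$, whose subdifferential at 0 is the interval $[-1, 1]$. The sketch below tests the subgradient inequality on a finite grid of points; passing such a test is only a necessary condition, and the helper name is hypothetical.

```python
def is_subgradient(f, x, y, test_points, tol=1e-12):
    """Check f(z) >= f(x) + (z - x) * y on a finite set of points z (1-D case)."""
    return all(f(z) >= f(x) + (z - x) * y - tol for z in test_points)

grid = [i / 10 for i in range(-20, 21)]

inside_interval = is_subgradient(abs, 0.0, 0.5, grid)    # 0.5 lies in [-1, 1]
outside_interval = is_subgradient(abs, 0.0, 1.5, grid)   # fails, e.g. at z = 1
zero_is_subgradient = is_subgradient(abs, 0.0, 0.0, grid)
```

The last check reflects the optimality condition quoted above: $0 \in \partial f(0)$, and indeed $|x|$ is minimized at $x = 0$.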

Overton and Womersley [OW93] analyzed the subgradients of functions which are a composition of a
differentiable matrix-valued function with a Ky Fan norm. The special case we need also follows from the
results of Lewis [Lew95].

Lemma 4.2.1 ([OW93], [Lew95]). Let $g_k(X) = \|X^{-1}\|_{(k)}$ for a positive definite matrix $X \in \mathbb{R}^{m \times m}$. Let
$\sigma_1 \ge \ldots \ge \sigma_m$ be the singular values of $X^{-1}$ and let $D$ be the diagonal matrix with the $\sigma_i$ on the diagonal.
Assume that for some $r \ge k$, $\sigma_k = \ldots = \sigma_r$. Then the subgradients of $g_k$ are given by

$$\partial g_k(X) = \mathrm{conv}\{-U_S U_S^T X^{-2} U_S U_S^T : U \text{ orthonormal},\ UDU^T = X^{-1},\ S \subseteq [r],\ |S| = k\},$$


where $U_S$ is the submatrix of $U$ indexed by $S$.

We use the following well-known characterization of the convex hull of boolean vectors of Hamming
weight $k$. For a proof, see [Sch03].

Lemma 4.2.2. Let $V_{k,n} = \mathrm{conv}\{v \in \{0, 1\}^n : \|v\|_1 = k\}$. Then $V_{k,n} = \{v : \|v\|_1 = k,\ 0 \le v_i \le 1\ \forall i\}$.
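
Lemma 4.2.2 turns membership in $V_{k,n}$ into a simple coordinate check. The sketch below (helper names are illustrative) applies it to the kind of vector that arises when the lemma is used together with Lemma 4.1.3: the singular values beyond index $t$, divided by their average $\alpha = \sum_{i > t} \sigma_i / (k - t)$. For simplicity the example assumes all singular values are non-zero.

```python
def in_hypersimplex(v, k, tol=1e-9):
    """Membership in V_{k,n}: entries in [0, 1] that sum to k (Lemma 4.2.2)."""
    return abs(sum(v) - k) <= tol and all(-tol <= vi <= 1 + tol for vi in v)

def scaled_tail(sigma, t, k):
    """The vector (sigma_{t+1}, ..., sigma_m) / alpha with alpha = sum_{i>t} sigma_i / (k - t)."""
    alpha = sum(sigma[t:]) / (k - t)
    return [s / alpha for s in sigma[t:]]

# sigma = (4, 1, 1, 1) and k = 2 give t = 1 and alpha = 3.
tail = scaled_tail([4, 1, 1, 1], 1, 2)     # three entries equal to 1/3
tail_ok = in_hypersimplex(tail, 1)         # sums to k - t = 1, entries at most 1
bad_ok = in_hypersimplex([1.5, -0.5], 1)   # right sum, entries outside [0, 1]
```

The entries are at most 1 exactly because $\sigma_{t+1} \le \alpha$, which is the right-hand inequality in (7).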

Before we prove Theorem 4.2 we need one more technical lemma.


Lemma 4.2.3. Let $\Sigma$ be an $m \times m$ positive semidefinite matrix of rank at least $k$. Then there exists an
$m \times m$ positive definite matrix $X$ such that $\Sigma \in -\partial g_k(X)$, and $g_k(X) = \|X^{-1}\|_{(k)} = h_k(\Sigma)$.


Proof. Let $r = \mathrm{rank}\, \Sigma$, and let $\sigma_1 \ge \ldots \ge \sigma_r$ be the non-zero singular values of $\Sigma$. Let $UDU^T = \Sigma$ be some
singular value decomposition of $\Sigma$: $U$ is an orthonormal matrix and $D$ is a diagonal matrix with the $\sigma_i$ on
the diagonal, followed by 0s.

Assume that $t$, $0 \le t < k$, is the integer (guaranteed by Lemma 4.1.3) for which $\sigma_t > \frac{\sum_{i > t} \sigma_i}{k - t} \ge \sigma_{t+1}$ and
define $\alpha = \frac{\sum_{i > t} \sigma_i}{k - t}$. Since $t < k$ and we assumed $\Sigma$ has rank at least $k$, we have $\alpha > 0$. Define

$$\sigma'_i = \begin{cases} \sigma_i & i \le t \\ \alpha & t < i \le r \\ \alpha - \epsilon & i > r \end{cases}$$

and set $D'$ to be the diagonal matrix with the $\sigma'_i$ on the diagonal. We set $\epsilon$ to be an arbitrary number satisfying
$\alpha > \epsilon > 0$. Let us set $X = (UD'U^T)^{-1/2}$. By Lemma 4.2.2 and the choice of $t$, the vector $(\sigma_{t+1}, \ldots, \sigma_r)$
is an element of the polytope $\alpha V_{k-t, r-t}$. Then $\Sigma$ is an element of $\mathrm{conv}\{U_S U_S^T X^{-2} U_S U_S^T : S = [t] \cup T,\ T \subseteq
\{t + 1, \ldots, r\},\ |T| = k - t\}$. Since this set is a subset of $-\partial g_k(X)$, we have $\Sigma \in -\partial g_k(X)$. A calculation
shows that $\|X^{-1}\|_{(k)} = \|(UD'U^T)^{1/2}\|_{(k)} = \sum_{i \le t} \sigma_i^{1/2} + (k - t)\alpha^{1/2} = h_k(\Sigma)$. This completes the proof. □

Proof of Theorem 4.2. We will use standard notions from the theory of convex duality. For a reference, see the
book by Boyd and Vandenberghe [BV04].

Let us define $\{X : X \succ 0\}$ to be the domain for the constraints (6) and the objective function (4).
This makes the constraint $X \succ 0$ implicit. The optimization problem is convex by Lemma 4.1.2. It is also
always feasible: for example, for $r$ an upper bound on the Euclidean norm of the longest column of $A$, $r^{-2} I$
is a feasible solution. Slater's condition is therefore satisfied, since the constraints are affine, and, therefore,
strong duality holds.

The Lagrange dual function for (4)-(6) is

$$g(p) = \inf_{X \succ 0} \|X^{-1}\|_{(k)} + \sum_{e \in \mathcal{U}} p_e (a_e^T X a_e - 1),$$


with dual variables $p \in \mathbb{R}^{\mathcal{U}}$, $p \ge 0$. Equivalently, we can define the diagonal matrix $P \in \mathbb{R}^{\mathcal{U} \times \mathcal{U}}$, $P \succeq 0$, with
entries $p_{ee} = p_e$, and the dual function becomes

$$g(P) = \inf_{X \succ 0} \|X^{-1}\|_{(k)} + \mathrm{tr}(APA^T X) - \mathrm{tr}(P). \qquad (11)$$

Since the terms $\|X^{-1}\|_{(k)}$ and $\mathrm{tr}(APA^T X)$ are non-negative for any $X \succ 0$, $g(P) \ge -\mathrm{tr}(P) > -\infty$. Therefore,
the effective domain $\{P : g(P) > -\infty\}$ of $g(P)$ is $\{P : P \succeq 0, \text{ diagonal}\}$. Since we have strong duality,
$\mu^2 = \max\{g(P) : P \succeq 0, \text{ diagonal}\}$.

By the additivity of subgradients, a matrix $X$ achieves the minimum in (11) if and only if $APA^T \in
-\partial g_k(X)$, where $g_k(X) = \|X^{-1}\|_{(k)}$. Consider first the case in which $APA^T$ has rank at least $k$. Then, by
Lemma 4.2.3 there exists an $X$ such that $APA^T \in -\partial g_k(X)$ and $\|X^{-1}\|_{(k)} = h_k(APA^T)$. Observe that, if
$U$ is an $m \times k$ matrix such that $U^T U = I$ and $\mathrm{tr}(U^T X^{-1} U) = \|X^{-1}\|_{(k)}$, then

$$\mathrm{tr}(UU^T X^{-2} UU^T X) = \mathrm{tr}((U^T X^{-2} U)(U^T X U)) = \mathrm{tr}(U^T X^{-1} U) = \|X^{-1}\|_{(k)}.$$

Since, by Lemma 4.2.1 and $APA^T \in -\partial g_k(X)$, $APA^T$ is a convex combination of matrices $UU^T X^{-2} UU^T$
with $U$ as above, it follows that $\mathrm{tr}(APA^T X) = \|X^{-1}\|_{(k)}$. Then we have

$$g(P) = \|X^{-1}\|_{(k)} + \mathrm{tr}(APA^T X) - \mathrm{tr}(P)$$

$$= 2\|X^{-1}\|_{(k)} - \mathrm{tr}(P) = 2h_k(APA^T) - \mathrm{tr}(P). \qquad (12)$$

If $P$ is such that $APA^T$ has rank less than $k$, we can reduce to the rank $k$ case by a continuity argument.
Fix any non-negative diagonal matrix $P$ and for $\lambda \in [0, 1]$ define $P(\lambda) = \lambda P + (1 - \lambda)I$. For any $\lambda \in [0, 1)$,
$AP(\lambda)A^T$ has rank $|\mathcal{Q}|$, since $AA^T$ has rank $|\mathcal{Q}|$ by assumption, and, therefore, $AP(\lambda)A^T \succeq (1 - \lambda)AA^T \succ 0$.
Then, by Corollary 7.5.1 in [Roc70], and (12), we have

$$g(P) = \lim_{\lambda \uparrow 1} g(P(\lambda)) = \lim_{\lambda \uparrow 1} \left[ 2h_k(AP(\lambda)A^T) - \lambda\, \mathrm{tr}(P) - (1 - \lambda)|\mathcal{U}| \right]$$

$$= 2h_k(APA^T) - \mathrm{tr}(P).$$

The final equality follows from the continuity of $h_k$, proved in Lemma 4.1.4.

Let us define new variables $Q$ and $c$, where $c = \operatorname{tr}(P)$ and $Q = P/c$. Because $h_k$ is homogeneous
with exponent 1/2, we can re-write $g(P)$ as $g(P) = g(Q, c) = 2\sqrt{c}\, h_k(AQA^T) - c$. From the first-order
optimality condition $\partial g / \partial c = 0$, we see that the maximum of $g(Q, c)$ is achieved when $c = h_k(AQA^T)^2$ and is
equal to $h_k(AQA^T)^2$. Therefore maximizing $g(P)$ over diagonal positive semidefinite $P$ is equivalent to
maximizing $h_k(AQA^T)^2$ over diagonal $Q \succeq 0$ with $\operatorname{tr}(Q) = 1$, which is the dual optimization problem. Since, by strong duality, the maximum of $g(P)$ is equal to the optimal value
of the primal minimization problem, this completes the proof. □
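
The last step of the proof can be checked numerically. The sketch below treats $h = h_k(AQA^T)$ as a fixed positive number (the value 3.0 is an arbitrary stand-in, not a quantity from the paper) and confirms on a grid that $c \mapsto 2\sqrt{c}\,h - c$ peaks at $c = h^2$ with value $h^2$.

```python
import math

def dual_value(c, h):
    """g(Q, c) = 2*sqrt(c)*h - c, with h standing in for h_k(A Q A^T)."""
    return 2.0 * math.sqrt(c) * h - c

h = 3.0  # arbitrary positive stand-in for h_k(A Q A^T)

# Grid over c in (0, 20] with step 0.01; it contains c = h^2 = 9.0 exactly.
grid = [i / 100.0 for i in range(1, 2001)]
best_c = max(grid, key=lambda c: dual_value(c, h))
best_value = dual_value(best_c, h)
```

This agrees with the identity $2\sqrt{c}\,h - c = h^2 - (\sqrt{c} - h)^2$, which shows directly that the maximum is $h^2$ and is attained only at $\sqrt{c} = h$.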

4.4 Proof of Theorem 4.1

Our strategy will be to use the dual formulation in Theorem 4.2 and the restricted invertibility principle to
give a lower bound on SpecLB($k$, $A$). First we state the restricted invertibility principle and a consequence
of it proved in [NT15].

Theorem 4.3 ([BT87, SS10]). Let $\varepsilon \in (0,1)$, let $M$ be an $m \times n$ real matrix, and let $W$ be an $n \times n$ diagonal
matrix such that $W \succeq 0$ and $\operatorname{tr}(W) = 1$. For any integer $k$ such that $k \le \varepsilon^2 \operatorname{tr}(MWM^T) / \|MWM^T\|_2$ there
exists a subset $S \subseteq [n]$ of size $|S| = k$ such that $\sigma_{\min}(M_S)^2 \ge (1 - \varepsilon)^2 \operatorname{tr}(MWM^T)$.

For the following lemma, which is a consequence of Theorem 4.3, we need to recall the definition of the
trace (nuclear) norm of a matrix $M$: $\|M\|_{tr}$ is equal to the sum of singular values of $M$.

Lemma 4.3.1 ([NT15]). Let $M$ be an $m$ by $n$ real matrix of rank $r$, and let $W \succeq 0$ be a diagonal matrix
such that $\operatorname{tr}(W) = 1$. Then there exists a submatrix $M_S$ of $M$, $|S| \le r$, such that $|S| \sigma_{\min}(M_S)^2 \ge
c^2 \|MW^{1/2}\|_{tr}^2 / (\log r)^2$, for a universal constant $c > 0$.

Proof of Theorem 4.1. Given a database size $n$ and a query matrix $A$, we compute the covariance matrix $\Sigma$ as
follows. We compute a matrix $X$ which gives an (approximately) optimal solution to the minimization problem of Theorem 4.2 for $k = \lfloor \varepsilon n \rfloor$,
and we set $\Sigma = X^{-1}$. Since this is a convex optimization problem, it can be solved in time polynomial
in $|\mathcal{Q}|$ to any degree of accuracy using the ellipsoid algorithm [GLS81] (or the algorithm of Overton and
Womersley [OW93]). By Lemma 3.0.2 and the constraints $a_e^T X a_e \le 1$, the mechanism is $(\varepsilon, \delta)$-differentially private with this
choice of $\Sigma$.
By Lemma 3.0.2,


err(>«r,n,A) = o((l + «) 1/2 ).^||E|| w . (1 3, 

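By Theorem 4.2, the optimal solution $Q$ of the dual problem satisfies

\[ \|\Sigma\|_{(k)}^{1/2} = h_k(AQA^T) = \sum_{i=1}^{t} \lambda_i^{1/2} + \sqrt{k - t} \Big( \sum_{i > t} \lambda_i \Big)^{1/2}, \]

where $\lambda_1 \ge \ldots \ge \lambda_m$ are the eigenvalues of $AQA^T$ and $t$, $0 \le t < k$, is an integer such that $(k - t)\lambda_t >
\sum_{i > t} \lambda_i \ge (k - t)\lambda_{t+1}$. At least one of $\sum_{i=1}^{t} \lambda_i^{1/2}$ and $\sqrt{k - t} (\sum_{i > t} \lambda_i)^{1/2}$ must be bounded from below by
$\frac{1}{2}\|\Sigma\|_{(k)}^{1/2}$. Next we consider these two cases separately.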
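
To make the formula for $h_k$ and the case split concrete, here is a minimal sketch that finds $t$ and the two terms from a list of eigenvalues. The function name, the scan over $t$, and the sample eigenvalues are assumptions made for illustration only. The scan returns the smallest $t$ with $\sum_{i>t} \lambda_i \ge (k - t)\lambda_{t+1}$; for $t > 0$, minimality implies the other inequality $(k - t)\lambda_t > \sum_{i>t} \lambda_i$.

```python
import math

def h_k_terms(eigenvalues, k):
    """Return (t, first, second) with h_k = first + second.

    first  = sum_{i <= t} sqrt(lam_i)
    second = sqrt(k - t) * sqrt(sum_{i > t} lam_i)
    Eigenvalues must be nonnegative; requires 1 <= k <= len(eigenvalues).
    """
    lam = sorted(eigenvalues, reverse=True)
    if not 1 <= k <= len(lam):
        raise ValueError("k must satisfy 1 <= k <= number of eigenvalues")
    for t in range(k):
        tail = sum(lam[t:])               # sum over i > t in 1-based indexing
        if tail >= (k - t) * lam[t]:      # lam[t] is lambda_{t+1} in 1-based indexing
            first = sum(math.sqrt(x) for x in lam[:t])
            second = math.sqrt(k - t) * math.sqrt(tail)
            return t, first, second
    # Not reached: t = k - 1 always satisfies the test above.
    raise AssertionError("no valid t found")

# Hypothetical spectra, chosen only to exercise both cases.
t1, first1, second1 = h_k_terms([9.0, 1.0, 1.0], 2)            # first term dominates
t2, first2, second2 = h_k_terms([4.0, 1.0, 1.0, 1.0, 1.0], 2)  # t = 0, second term dominates
```

Since $h_k$ is the sum of the two nonnegative terms, the larger one is always at least half of $h_k$, which is exactly the dichotomy used in the two cases that follow.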

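Assume first that $\sum_{i=1}^{t} \lambda_i^{1/2} \ge \frac{1}{2}\|\Sigma\|_{(k)}^{1/2}$. Let $\Pi$ be the orthogonal projection operator onto the eigenspace
of $AQA^T$ corresponding to $\lambda_1, \ldots, \lambda_t$. Then, because $\lambda_1^{1/2} \ge \ldots \ge \lambda_t^{1/2}$ are the nonzero singular values of
$\Pi A Q^{1/2}$, we have $\|\Pi A Q^{1/2}\|_{tr} = \sum_{i=1}^{t} \lambda_i^{1/2} \ge \frac{1}{2}\|\Sigma\|_{(k)}^{1/2}$. By Lemma 4.3.1, applied to the matrices $M = \Pi A$
and $W = Q$, there exists a set $S \subseteq \mathcal{U}$ of size at most $|S| \le \operatorname{rank} \Pi A = t \le \varepsilon n$, such that

\[ \mathrm{SpecLB}(\varepsilon n, A) \ge \sqrt{\frac{|S|}{|\mathcal{Q}|}}\, \sigma_{\min}(A_S) \ge \sqrt{\frac{|S|}{|\mathcal{Q}|}}\, \sigma_{\min}(\Pi A_S) \ge \frac{c \|\Pi A Q^{1/2}\|_{tr}}{(\log \varepsilon n) \sqrt{|\mathcal{Q}|}} \ge \frac{c \|\Sigma\|_{(k)}^{1/2}}{2 (\log \varepsilon n) \sqrt{|\mathcal{Q}|}} \tag{14} \]

for an absolute constant $c$.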

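For the second case, assume that $\sqrt{k - t} (\sum_{i > t} \lambda_i)^{1/2} \ge \frac{1}{2}\|\Sigma\|_{(k)}^{1/2}$. Let $\Pi$ now be an orthogonal projection
operator onto the eigenspace of $AQA^T$ corresponding to $\lambda_{t+1}, \ldots, \lambda_m$. By the choice of $t$, we have

\[ \frac{\operatorname{tr}(\Pi AQA^T \Pi)}{\|\Pi AQA^T \Pi\|_2} = \frac{\sum_{i > t} \lambda_i}{\lambda_{t+1}} \ge k - t. \]

By Theorem 4.3, applied with $M = \Pi A$, $W = Q$, and $\varepsilon = \frac{1}{2}$, there exists a set $S \subseteq \mathcal{U}$ of size $\frac{1}{4}(k - t) \le k \le \varepsilon n$
so that

\[ \mathrm{SpecLB}(\varepsilon n, A)^2 \ge \frac{|S|}{|\mathcal{Q}|}\, \sigma_{\min}(A_S)^2 \ge \frac{|S|}{|\mathcal{Q}|}\, \sigma_{\min}(\Pi A_S)^2 \ge \frac{(k - t)(\sum_{i > t} \lambda_i)}{16 |\mathcal{Q}|} \ge \frac{\|\Sigma\|_{(k)}}{64 |\mathcal{Q}|}. \tag{15} \]

The theorem follows from (13), the fact that at least one of (14) or (15) holds, and Theorem 2.1.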

□ 


5 Conclusion 

Several natural problems remain open. Probably the most important one is to prove results analogous to 
ours for worst case, rather than average, error. In that case the simple post-processing strategy of the 
projection mechanism will likely not be sufficient. Another interesting problem is to remove the dependence 
on the universe size in the competitiveness ratio. It is plausible that this can be done with the projection
mechanism and a well-chosen Gaussian noise distribution, but we would need tighter lower bounds, possibly
based on fingerprinting codes as in [BUV13].